Dynamic Backends in HAProxy with Lua
Egnyte is a multi-tenant cloud service provider with a highly distributed backend deployment. Our systems are spread across multiple datacenters, and within each datacenter we have several pods (logical service partitions) that share components such as databases, search indexes, and application servers. Each customer is located on a specific pod, so the first and foremost task for any incoming request is to identify the pod on which the customer is located, based on the Host header of the request. The customer-to-pod mapping lives in an internal repository that we call the Global Directory Service (GDS), which exposes REST interfaces to query the mapping for a given customer.

HAProxy servers terminate SSL and also act as a central router for all requests. HAProxy is a fast and efficient open source load balancer; we found it can easily terminate SSL and load balance 10K concurrent requests to upstream servers on a standard four-core CentOS machine. Unlike standard Apache, HAProxy is based on an event-poll mechanism and can therefore handle a very high number of concurrent long-running connections.

One of the fundamental roadblocks we hit when building this customer-facing HAProxy tier was identifying the pod on which a customer is located: HAProxy has no direct way of building a dynamic backend proxy where the backend is decided at runtime by looking up an external service. We eventually built a Lua extension to our HAProxy conf that looks up GDS and injects the result into an HAProxy map, which can then be used to route requests.

Lua function:

    function get_backend(txn)
        -- Host header of the incoming request (index 0 = first value)
        local host = txn.http:req_get_headers()["host"][0]
        local sock = core.tcp()
        sock:connect("127.0.0.1", 6280)
        sock:send("GET /rest/private/gds/backend/" .. host .. "\r\n")
        local result = sock:receive("*a")
        sock:close()
        -- Cache the looked-up backend in the runtime map
        core.set_map("/tmp/backend.map", host, result)
    end

    core.register_action("get_backend", { "http-req" }, get_backend)

/tmp/backend.map is a placeholder map that has a single entry at HAProxy startup time; further entries are then loaded at runtime on demand. Alternatively, we can seed the map with initial entries for well-known services at startup. The Lua function creates a socket connection to the GDS service, makes a simple HTTP call to fetch the backend for the given Host header, and caches the result in the map /tmp/backend.map.

At runtime, after a few requests, the content of /tmp/backend.map in HAProxy memory will look like this:

    host1.egnyte.com pod0
    host2.egnyte.com pod1
    host3.egnyte.com pod2

Haproxy Conf:

    global
        ….
        lua-load /home/egnyte/haproxy/lua/code/gds.lua
        ….

    frontend luatest
        use_backend %[hdr(host),lower,map(/tmp/backend.map)] if FALSE # To declare the map
        http-request lua.get_backend
        use_backend %[hdr(host),lower,map(/tmp/backend.map)]

    backend pod0
        server pod 192.168.56.101:6280 maxconn 2

    backend pod1
        server pod 192.168.56.102:6280 maxconn 2

    backend pod2
        server pod 192.168.56.103:6280 maxconn 2

Haproxy Conf explained:

In the HAProxy conf, we first load the Lua script file.
    lua-load /home/egnyte/haproxy/lua/code/gds.lua

Next, we define the backend lookup with an always-false condition:

    use_backend %[hdr(host),lower,map(/tmp/backend.map)] if FALSE # To declare the map

This use_backend line is necessary for the Lua script to be able to see the map and seed it. Then we invoke our Lua script:

    http-request lua.get_backend

The Lua script seeds the cache at this point by looking up GDS, and we route the request to the backend using the map:

    use_backend %[hdr(host),lower,map(/tmp/backend.map)]

So, if the request's Host header is host1.egnyte.com, the request is routed to pod0 based on the map contents seeded by the Lua script. We could further optimize the conf by checking the map before calling the Lua script and skipping the Lua call on a hit, or by checking for the map value inside the Lua script before calling GDS. Thus, using a basic Lua script, we have been able to tweak HAProxy to support a truly dynamic backend.
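The first optimization mentioned above — checking the map before calling the Lua script — can be expressed directly in the frontend using an inline ACL with the `-m found` match, so the Lua action only fires when the host is not yet cached. This is a sketch, not our production conf; the frontend and map names mirror the example above:

```
frontend luatest
    # Call out to Lua (and hence GDS) only when the Host header has no
    # entry in the map yet; cached hosts skip the lookup entirely.
    http-request lua.get_backend if !{ hdr(host),lower,map(/tmp/backend.map) -m found }
    use_backend %[hdr(host),lower,map(/tmp/backend.map)]
```

With this guard in place, the GDS round trip is paid only once per host; every subsequent request for that host is routed straight from the in-memory map.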